🌱 Add separate concurrency flag for cluster cache tracker #9116
Conversation
Force-pushed from 022f1fb to 6700138
By marking it as deprecated, the old flag won't be visible anymore in --help.
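For context, a minimal sketch of how an old flag could be kept working but marked deprecated with pflag, which hides deprecated flags from the help output and prints a notice when they are used (the old flag name below is purely illustrative):

```go
package flags

import "github.com/spf13/pflag"

// registerConcurrencyFlags sketches keeping an old flag while marking it
// deprecated; pflag then hides it from --help and prints the given message
// whenever it is set on the command line.
func registerConcurrencyFlags(fs *pflag.FlagSet, oldConcurrency, cctConcurrency *int) {
	// "old-concurrency" is a hypothetical legacy flag name used for illustration.
	fs.IntVar(oldConcurrency, "old-concurrency", 10,
		"Deprecated: use --clustercachetracker-concurrency instead")

	fs.IntVar(cctConcurrency, "clustercachetracker-concurrency", 10,
		"Number of clusters to process simultaneously in the cluster cache tracker")

	// MarkDeprecated hides the flag from usage/help output.
	_ = fs.MarkDeprecated("old-concurrency", "use --clustercachetracker-concurrency instead")
}
```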
One nit/question, otherwise looks good to me, thanks
Force-pushed from 6700138 to cbf78d4
Sorry about forgetting to push my commit 🤦 ☕
Force-pushed from cbf78d4 to fb958c0
Restarted the linter.
Force-pushed from fb958c0 to 807e0eb
/lgtm
/hold
LGTM label has been added. Git tree hash: 5d527676520673ea1c4a4f289c53fd2cdfe6108d
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: sbueringer
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
/lgtm
/area clustercachetracker
/lgtm
/hold cancel
@@ -134,6 +135,9 @@ func InitFlags(fs *pflag.FlagSet) {
	fs.IntVar(&kubeadmControlPlaneConcurrency, "kubeadmcontrolplane-concurrency", 10,
		"Number of kubeadm control planes to process simultaneously")

	fs.IntVar(&clusterCacheTrackerConcurrency, "clustercachetracker-concurrency", 10,
clustercachetracker is an implementation detail that's now being exposed to users; could we find a name like child-cluster-concurrency or workload-cluster-concurrency?
I couldn't think of a better name that expresses what we do :). So I thought it's a good idea to use something close to the name of the controller.
We could use something like workload-cluster-concurrency, but to be honest that seems misleading. It's not actually the concurrency of how many workload clusters we can reconcile, and all our clusters are workload clusters anyway.
Is something like clustercache-concurrency better? It doesn't point directly to the CCT, but it still expresses somewhat that it's the concurrency of the cache we use for clusters.
@vincepri wdyt about dropping the flag and just hard-coding it to 10? 10 should be enough anyway, so there is no need for a flag.
I just wanted to avoid the CCT using a lot of workers when someone needs more workers for the regular cluster reconcilers.
@@ -135,6 +136,9 @@ func initFlags(fs *pflag.FlagSet) {
	fs.IntVar(&concurrency, "concurrency", 10,
		"The number of docker machines to process simultaneously")

	fs.IntVar(&clusterCacheTrackerConcurrency, "clustercachetracker-concurrency", 10,
If this isn't set, could we actually default to concurrency? It makes some sense to keep these in sync.
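A minimal sketch of what that fallback could look like, assuming pflag's FlagSet.Changed is used to detect whether the new flag was set explicitly (the helper name is hypothetical):

```go
package flags

import "github.com/spf13/pflag"

// resolveCCTConcurrency is a hypothetical helper: if the dedicated
// --clustercachetracker-concurrency flag was not set explicitly, fall back
// to the generic --concurrency value so the two stay in sync.
func resolveCCTConcurrency(fs *pflag.FlagSet, concurrency, cctConcurrency int) int {
	if !fs.Changed("clustercachetracker-concurrency") {
		return concurrency
	}
	return cctConcurrency
}
```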
I think usually you wouldn't use the same number of workers. Usually you need more workers for a controller when it otherwise can't keep up with the load, and that mainly depends on reconcile duration.
Looking at the Reconcile func of ClusterCacheReconciler, it will always return almost instantly. So even with >10k Clusters I don't think we need more than 10 workers (with a 1 ms reconcile duration, 10 workers can reconcile 10k Clusters in 1s).
In fact, the only reconciler that I had to run with more than 10 workers with 2k clusters was KubeadmControlPlane (because it had reconcile durations of ~1-2 seconds). In that case I used 50 workers.
So while we would have to increase the concurrency of the Cluster controllers above 10 at some point, the ClusterCacheReconciler would still be fine with 10.
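For reference, a minimal sketch of how such a concurrency value is typically wired into a controller via controller-runtime's MaxConcurrentReconciles; the setup helper below is illustrative and not necessarily the exact Cluster API code:

```go
package controllers

import (
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/controller"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"

	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
)

// setupWithConcurrency shows how a concurrency flag value usually ends up as
// MaxConcurrentReconciles, i.e. the number of parallel workers serving the
// controller's work queue.
func setupWithConcurrency(mgr ctrl.Manager, r reconcile.Reconciler, clusterCacheTrackerConcurrency int) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&clusterv1.Cluster{}).
		WithOptions(controller.Options{
			// Each worker handles one reconcile at a time; with ~1 ms per
			// reconcile, 10 workers can work through ~10k objects per second.
			MaxConcurrentReconciles: clusterCacheTrackerConcurrency,
		}).
		Complete(r)
}
```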
What this PR does / why we need it:
Adds a separate flag --clustercache-tracker-concurrency to allow adjusting the concurrency of the created cluster cache tracker in all relevant executables: CAPI, CAPBK, KCP, CAPD. This came up while adding the cluster cache tracker to CAPV in this discussion.
Before this PR, the other concurrency flags were re-used to also set the concurrency of the cluster cache tracker.
Example from @sbueringer of why this change makes sense:
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #